
Physics of Particles and Nuclei Letters. 2016. V. 13, No. 5(203). P. 1010-1019

INTEGRATION OF PANDA WORKLOAD MANAGEMENT SYSTEM WITH SUPERCOMPUTERS

K. De a, S. Jha b, A. A. Klimentov c,d, T. Maeno c, R. Yu. Mashinistov d,1, P. Nilsson c, A. M. Novikov d, D. A. Oleynik a,e, S. Yu. Panitkin c, A. A. Poyda d, K. F. Read f, E. A. Ryabinkin d, A. B. Teslyuk d, V. E. Velikhov d, J. C. Wells f, T. Wenaus c

a University of Texas at Arlington, Arlington, TX, USA
b Rutgers University, Piscataway, NJ, USA
c Brookhaven National Laboratory, Upton, NY, USA
d National Research Center "Kurchatov Institute", Moscow
e Joint Institute for Nuclear Research, Dubna
f Oak Ridge National Laboratory, Oak Ridge, TN, USA

The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe and were recently credited with the discovery of the Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multipetabyte data processing challenge, the ATLAS experiment relies on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System to manage the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3+ petaflops, the next LHC data-taking runs will require more resources than grid computing can possibly provide. To alleviate these challenges, the LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at the integration of the PanDA WMS with supercomputers in the United States, Europe, and Russia (in particular, with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the supercomputer at the National Research Center "Kurchatov Institute", IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and for local data management, with lightweight MPI wrappers to run single-threaded workloads in parallel on Titan's multicore worker nodes. This implementation was tested with a variety of Monte Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running the PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for high energy and nuclear physics, as well as for other data-intensive science applications, such as bioinformatics and astroparticle physics.

1 E-mail: Ruslan.Mashinistov@cern.ch

INTRODUCTION

The ATLAS experiment [1] at the Large Hadron Collider (LHC) is designed to explore the fundamental properties of matter for the next decade at the highest energy ever achieved at a laboratory. Since the LHC became operational in 2009, the experiment has produced and distributed hundreds of petabytes of data worldwide among the O(100) heterogeneous computing centres of the Worldwide LHC Computing Grid (WLCG) [2]. Thousands of physicists are engaged in analyzing these data.

The Large Hadron Collider has returned to operations after a two-year offline period, Long Shutdown 1, which allowed thousands of physicists worldwide to undertake crucial upgrades to the already cutting-edge particle accelerator. The LHC now begins its second multiyear operating period, Run 2, which will take the collider through 2018 with collision energies nearly double those of Run 1. In other words, Run 2 will nearly double the energies that allowed researchers to detect the long-sought Higgs boson in 2012.

The WLCG computing sites are usually dedicated clusters specifically set up to meet the needs of the LHC experiments. More than one million grid jobs run per day on the distributed computing sites all over the world, on more than 200,000 CPU cores. The WLCG infrastructure will be sufficient for the planned analysis and data processing, but it will be insufficient for Monte Carlo (MC) production and any extra activities. Additional computing and storage resources are therefore required. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources, i.e., the opportunistic use of supercomputers and high-performance computing clusters (HPCs).

PANDA WORKLOAD MANAGEMENT SYSTEM

A sophisticated Workload Management System (WMS) is needed to manage the distribution and processing of huge amounts of data. The PanDA (Production and Distributed Analysis) WMS [3] was designed to meet ATLAS requirements for a data-driven workload management system for production and distributed analysis processing capable of operating at the LHC data processing scale. PanDA has a highly scalable architecture. Scalability has been demonstrated in ATLAS through the rapid increase in usage over the past several years of operations and is expected to accommodate the continuously growing number of jobs over the next decade. Currently, as of 2015, the PanDA WMS manages the processing of over one million jobs per day on the ATLAS grid. PanDA was designed to have the flexibility to adapt to emerging computing technologies in processing, storage, and networking, as well as in the underlying software stack (middleware). This flexibility has also been successfully demonstrated through the past six years of evolving technologies adopted by computing centers in ATLAS, which span many continents and yet are seamlessly integrated into PanDA.

PanDA is a pilot-based [4, 5] WMS. In the PanDA job lifecycle, pilot jobs (Python scripts that organize workload processing on a worker node) are submitted to compute sites. When these pilot jobs start on a worker node, they contact a central server to retrieve a real payload (i.e., an end-user job) and execute it. Using these pilot-based workflows helps to improve job reliability, optimize resource utilization, allow for opportunistic resource usage, and mitigate many of the problems associated with the inhomogeneities found on the grid.
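To make the pilot lifecycle concrete, the following is a minimal, purely illustrative sketch of a pilot's main loop in Python. The server URL, the endpoint names, and the job-description fields are hypothetical placeholders and do not reproduce the actual PanDA pilot protocol.

```python
# Minimal, illustrative pilot loop. The server URL, endpoints, and job
# description fields below are hypothetical placeholders, not the real
# PanDA protocol.
import json
import subprocess
import urllib.request

PANDA_SERVER = "https://panda.example.org"   # hypothetical endpoint


def get_job():
    """Ask the central server for a payload (an end-user job)."""
    with urllib.request.urlopen(PANDA_SERVER + "/getJob") as resp:
        return json.loads(resp.read().decode())


def report(job_id, state):
    """Report the job state back to the central server."""
    data = json.dumps({"jobID": job_id, "state": state}).encode()
    req = urllib.request.Request(PANDA_SERVER + "/updateJob", data=data,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)


def main():
    job = get_job()                      # retrieve the real payload
    if not job:
        return                           # nothing to run: exit quietly
    report(job["jobID"], "running")
    rc = subprocess.call(job["command"], shell=True)   # execute the payload
    report(job["jobID"], "finished" if rc == 0 else "failed")


if __name__ == "__main__":
    main()
```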

Extending PanDA beyond the grid will further expand the potential user community and the resources available to them. The JEDI (Job Execution and Definition Interface) extension to PanDA adds new functionality to the PanDA server to dynamically break down tasks based on the optimal usage of available processing resources. With this new capability, tasks can now be broken down at the level of individual events, event clusters, or ensembles, as opposed to the traditional file-based task granularity. This allows the recently developed ATLAS Event Service to dynamically deliver to a compute node only that portion of the input data which will actually be processed there by the payload application (simulation, reconstruction, and/or data analysis), thus avoiding costly pre-staging operations for entire data files. The Event Service leverages modern networks for efficient remote data access and highly scalable object store technologies for data storage. It is agile and efficient in exploiting diverse, distributed, and potentially short-lived (opportunistic) resources: "conventional" resources (grid), supercomputers, commercial clouds, and volunteer computing.

EXTENDING PANDA TO SUPERCOMPUTERS

Modern High Performance Computing (HPC) platforms encompass a broad spectrum of computing facilities, ranging from small-scale interconnected clusters to the largest supercomputers in the world. They are rich sources of CPUs, some claiming more cores than the entire ATLAS grid. HPC machines are built to execute large-scale parallel, computationally intensive workloads with high efficiency. They provide high-speed interconnects between worker (compute) nodes and facilities for low-latency internode communication. For the ATLAS experiment (or, more broadly, the WLCG) to make effective and impactful use of HPCs, it is not a requirement that HPCs be able to run any possible task, nor is it relevant how many kinds of job types can be run. What matters to ATLAS is the total number of cycles that can be offloaded from the traditional grid resources.

The standard ATLAS workflow is not well adapted for HPCs due to several complications: typically the worker node setup is fixed, with no direct wide-area network connections; the amount of RAM per core can be quite limited; and in many cases a customized operating system is used along with a specialized software stack. The following is a nonexhaustive list of known problems with suggested solutions. Communication with the PanDA server is typically not possible at the worker node level; hence, the payload must be fully defined in advance, and all communication with the PanDA server is done from the HPC front-end nodes, where wide-area network connections are allowed. The central software repository is not always accessible from within the HPC, so it should be synchronized to a shared file system instead; the same applies to using a local copy of the database release file instead of connecting to the database service. The network throughput to and from the HPC is limited, which can make jobs with large input/output challenging; CPU-intensive event generation and Monte Carlo simulation are therefore the ideal workloads for HPCs. Finally, using the Storage Element (SE) of a close Tier-1/2 site for stage-in and stage-out can be a good solution, as HPCs typically do not provide an SE.
Supercomputing centers in the USA, Europe, and Asia, in particular, the Titan supercomputer [6] at the Oak Ridge Leadership Computing Facility (OLCF) and the National Energy Research Scientific Computing Center (NERSC) in the USA, the Ostrava supercomputing center in the Czech Republic, and the "Kurchatov Institute" in Russia (NRC KI), are now integrated within the ATLAS workflow via the PanDA WMS. This will make Leadership Computing facilities of much greater utility and impact for HEP computing in the future. Developments in these directions are presently underway in the LHC experiments.

PANDA ON TITAN AT OAK RIDGE LEADERSHIP COMPUTING FACILITY

The Titan supercomputer, currently number two (number one until June 2013) on the Top 500 list [7], is located at the Oak Ridge Leadership Computing Facility within Oak Ridge National Laboratory, USA. It has a theoretical peak performance of 27 petaflops. Titan was the first large-scale system to use a hybrid CPU-GPU architecture, with worker nodes combining 16-core AMD Opteron 6274 CPUs and NVIDIA Tesla K20X GPU accelerators. It has 18,688 worker nodes with a total of 299,008 CPU cores. Each node has 32 GB of RAM and no local disk storage, though a RAM disk with a maximum capacity of 16 GB can be set up if needed. Worker nodes use Cray's Gemini interconnect for internode MPI messaging, but have no connection to the wide-area network. Titan is served by the shared Lustre file system, which has 32 PB of disk storage, and by HPSS tape storage with a capacity of 29 PB. Titan's worker nodes run Compute Node Linux, a runtime environment based on a Linux kernel derived from SUSE Linux Enterprise Server.

Taking advantage of its modular and extensible design, the PanDA pilot code and logic have been enhanced with tools and methods relevant for HPC. The pilot runs on Titan's front-end nodes, which allows it to communicate with the PanDA server, since the front-end nodes have connectivity to the Internet. The interactive front-end machines and the worker nodes share a file system, which makes it possible for the pilot to stage in the input files required by the payload and to stage out the produced output files at the end of the job. The ATLAS Tier-1 computing center at Brookhaven National Laboratory is currently used for data transfers to and from Titan, but in principle any grid site could serve this role. The pilot submits ATLAS payloads to the worker nodes using the local batch system (PBS) via the SAGA (Simple API for Grid Applications) interface [8]; a minimal submission sketch is given below. Figure 1 shows the schematic diagram of PanDA components on Titan.

Fig. 1. Schematic view of PanDA interface with Titan
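The following is a minimal sketch of such a submission through SAGA, assuming the saga-python package; the adaptor URL, wrapper script path, project name, and resource numbers are illustrative placeholders, not the actual pilot code.

```python
# Minimal SAGA-based submission sketch (saga-python). The adaptor URL,
# wrapper path, project, queue, and resource numbers below are placeholders.
import saga

NODES = 16
CORES_PER_NODE = 16

js = saga.job.Service("pbs://localhost")        # local batch system on the front end

jd = saga.job.Description()
jd.executable      = "/lustre/atlas/proj/panda/wrapper.sh"   # hypothetical wrapper script
jd.total_cpu_count = NODES * CORES_PER_NODE     # translated into a node request
jd.wall_time_limit = 120                        # requested walltime, in minutes
jd.project         = "HEP101"                   # hypothetical allocation
jd.queue           = "batch"
jd.output          = "wrapper.out"
jd.error           = "wrapper.err"

job = js.create_job(jd)
job.run()                                       # hands the job to the batch system
print("submitted job", job.id)
job.wait()                                      # block until the batch job finishes
print("final state:", job.state)
```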
The majority of experimental high-energy physics workloads do not use the Message Passing Interface (MPI); they are designed around event-level parallelism and are executed on the grid independently. Typically, detector simulation workloads can run on a single compute node using multiprocessing. For running such workloads on Titan, we developed an MPI wrapper that launches multiple instances of single-node workloads simultaneously. MPI wrappers are typically workload specific, since they are responsible for setting up the workload-specific environment, organizing per-rank worker directories, rank-specific data management, input-parameter modification when necessary, and cleanup on exit. The wrapper scripts are what the pilot actually submits to a batch queue to run on Titan. The pilot reserves the necessary number of worker nodes at submission time, and at run time a corresponding number of copies of the wrapper script are activated on Titan. Each copy knows its MPI rank (an index that runs from zero to the maximum number of nodes or script copies) as well as the total number of ranks in the current submission. When activated on a worker node, each copy of the wrapper script, after completing the necessary preparations, starts the actual payload as a subprocess and waits until its completion. In other words, the MPI wrapper serves as a "container" for non-MPI workloads and allows us to efficiently run unmodified grid-centric workloads on parallel computational platforms like Titan.
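A stripped-down sketch of such a wrapper, assuming mpi4py and a payload passed on the command line (the directory layout and logging are illustrative, not the actual ATLAS wrapper), is:

```python
# Illustrative MPI wrapper for non-MPI payloads: every rank prepares its own
# working directory and runs one copy of a single-node payload. The payload
# command and directory layout are hypothetical.
import os
import subprocess
import sys

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()          # index of this copy (0 .. size-1)
size = comm.Get_size()          # total number of copies in this submission

payload = sys.argv[1:]          # e.g. a single-node simulation command line

# Per-rank working directory so the copies do not interfere with each other.
workdir = os.path.join(os.getcwd(), "rank_%04d" % rank)
os.makedirs(workdir, exist_ok=True)

# Each rank starts the payload as a subprocess and waits for completion.
with open(os.path.join(workdir, "payload.log"), "w") as log:
    rc = subprocess.call(payload, cwd=workdir, stdout=log,
                         stderr=subprocess.STDOUT)

# Gather return codes on rank 0 so the wrapper can report overall success.
codes = comm.gather(rc, root=0)
if rank == 0:
    failed = sum(1 for c in codes if c != 0)
    print("%d of %d payload copies failed" % (failed, size))
```

On Titan a wrapper of this kind would typically be launched with the Cray aprun command, one rank per worker node, e.g. aprun -n <nodes> -N 1 python wrapper.py <payload command>.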

Leadership Computing Facilities (LCFs), like Titan, are geared towards large-scale jobs by design. Time allocation on LCF machines is very competitive, and large-scale projects are often preferred. This is especially true for Titan at OLCF, which was designed to be the most powerful machine in the world, capable of running extreme-scale computational projects. As a consequence, on average about ten percent of Titan's capacity goes unused due to mismatches between job sizes and available resources: worker nodes sit idle because there are not enough of them to handle a large-scale computing job. On Titan, these ten percent correspond to an estimated 300M core-hours per year. Hence, a system that can occupy those temporarily free nodes would be very valuable: it would allow the LCF to deliver more compute cycles for scientific research while simultaneously improving resource-utilization efficiency on Titan. This offers a great opportunity for PanDA to harvest these opportunistic resources. Functionality has been added to the PanDA pilot to interact with Titan's scheduler and collect information about available unused worker nodes. This allows the pilot to define precisely the size and duration of jobs submitted to Titan according to the available free resources. An additional benefit of this implementation is very short wait times before job execution on Titan: since PanDA-submitted jobs match the currently available free resources exactly, or at least very closely, they present, in the majority of cases, the best option for Titan's job scheduler to achieve maximum resource utilization, resulting in short wait times for these jobs. We note that care must be taken to manage potential contention for shared system resources, e.g., internal communication bandwidth, I/O bandwidth, and access to front-end and data-transfer nodes.
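In outline, this backfill logic amounts to the sketch below: ask the scheduler how many nodes are currently free and for how long, then shape the job request to fit. On Titan this information comes from the Moab scheduler (e.g., its showbf utility); the query function here is a stub returning fixed example values, since the exact query and its output format are site-specific.

```python
# Sketch of backfill-aware job shaping. query_backfill() is a stub: in the
# real pilot the numbers come from the batch scheduler (on a Moab-managed
# system, e.g., by parsing the output of `showbf`).

MAX_NODES = 300          # the largest request we are willing to make
MAX_WALLTIME_MIN = 120   # the longest walltime we are willing to request


def query_backfill():
    """Return (free_nodes, window_minutes) for the current backfill slot."""
    return 250, 95       # fixed example values standing in for a scheduler query


def shape_job(free_nodes, window_minutes):
    """Fit the submission to the free slot, capped by our own limits."""
    nodes = min(free_nodes, MAX_NODES)
    walltime = min(window_minutes, MAX_WALLTIME_MIN)
    return nodes, walltime


if __name__ == "__main__":
    nodes, walltime = shape_job(*query_backfill())
    print("submitting %d nodes for %d minutes" % (nodes, walltime))
```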

Fig. 2. Number of ATLAS production jobs on Titan

Titan has been fully integrated with the ATLAS PanDA-based production and analysis system, and the ATLAS experiment now routinely runs Monte Carlo simulation tasks there. All operations, including data transfers to and from Titan, are transparent to the ATLAS Computing Operations team and physicists. Figure 2 shows an example of the ATLAS monitoring dashboard plot of running ATLAS production jobs on Titan.

INTEGRATION OF TIER-1 GRID CENTER WITH HIGH-PERFORMANCE COMPUTER AT NRC KI

A pioneering project to combine the Tier-1 center, the supercomputer, and the cloud platform into a single portal at the Kurchatov Institute began in 2014 and continues to the present day. The portal is aimed at providing an interface to run jobs at the Tier-1 grid and the supercomputer using common storage. It is used for ATLAS production and user analysis tasks and also for biology studies, namely genome sequencing analysis. The Tier-1 facility at NRC KI is part of the WLCG and will process and store up to 10% of the total data obtained from the ALICE, ATLAS, and LHCb experiments. The second-generation supercomputer HPC2 [9] is based on Intel Xeon E5450 (3.00 GHz) processors. Currently, 32 worker nodes (256 cores) are provided for ATLAS production tasks, and two worker nodes (16 cores) are provided for ATLAS user analysis jobs. An OpenStack-based cloud platform with a performance of 1.5 TFLOPS provides 16 nodes (256 cores), 512 GB of RAM, 60 TB of storage, and InfiniBand connectivity. The integration schema of the PanDA WMS with Kurchatov's supercomputer is shown in Fig. 3. A local APF (Auto Pilot Factory), an independent subsystem that manages the delivery of "pilots" to the HPC's worker nodes via a number of schedulers serving the sites at which PanDA operates, was installed to launch ATLAS jobs.

Fig. 3. Integration of Tier-1 grid and supercomputer using PanDA

The integration was done following the basic WLCG approach, where one pilot runs on one core. The worker nodes of the supercomputer have access to the Internet; they have direct access to the data, to the software, and to the PanDA server. To support the ATLAS workflow, CVMFS (CERN Virtual Machine File System) was installed on the worker nodes. CVMFS provides access to the full set of ATLAS software releases.

A local PanDA instance was also installed at NRC KI for biology studies. The local instance consists of the following main components: the PanDA server, the Auto Pilot Factory, the monitor, and a database server (MySQL). The Auto Pilot Factory is configured so that it works with standard pilots to run ATLAS jobs taken from the production server at CERN and also operates with the pilot adapted to HPC to run non-ATLAS jobs taken from the local PanDA server at NRC KI. The PanDA monitor performs detailed monitoring of the jobs for status diagnostics.

The HPC-pilot project was initiated for the Titan supercomputer and successfully adapted to the Kurchatov Institute's supercomputer HPC2. The HPC-pilot provides the ability to run MPI parallel jobs and to move data to and from the supercomputer. It runs on the HPC interactive node and communicates with the local batch scheduler to manage jobs over the available CPUs. The implementation of the HPC-pilot at NRC KI is used to run biology jobs that analyze data obtained in genome sequencing, in collaboration with the Genomics laboratory of the Kurchatov Institute. This analysis consists in studying ancient DNA samples using the Paleomix [10] pipeline application. This pipeline combines a number of open-source tools for rapid processing of Next Generation Sequencing (NGS) data. The common shared queue of the supercomputer is provided to run biology jobs, allocating up to 1000 available CPU cores.

PANDA EVENT SERVICE AND SUPERCOMPUTERS

The Event Service is a complex distributed system in which different components communicate with each other over the network using HTTP. For event processing, it uses AthenaMP, a process-parallel version of the ATLAS simulation, reconstruction, and data analysis framework Athena. A PanDA pilot starts an AthenaMP application on the compute node and waits until it goes through the initialization phase and forks worker processes. After that, the pilot requests an event-based workload from the PanDA JEDI, which is dynamically delivered to the pilot in the form of event ranges. An event range is a string that, together with other information, contains the positional numbers of events within the file and a unique file identifier (GUID). The pilot streams event ranges to the running AthenaMP application, which takes care of the event data retrieval, the event processing, and the production of output files (a new output file for each range). The pilot monitors the directory in which the output files are produced and, as they appear, sends them to an external aggregation facility (Object Store) for final merging.

Supercomputers are one of the important deployment platforms for Event Service applications. However, on most HPC machines there is no connection from the compute nodes to the wide-area network. This limitation makes it impossible to run the conventional Event Service on such systems, because the payload component needs to communicate with central services (e.g., job brokerage, data aggregation facilities) over the network.
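For illustration, the pilot-side flow described above (request event ranges from JEDI, feed them to AthenaMP, watch the output directory, ship finished files to the Object Store) reduces to a loop of the following shape. All functions here are stubs with hypothetical names and a simplified event-range format; they stand in for the real HTTP and data-transfer calls.

```python
# Simplified sketch of the pilot side of the Event Service loop. Every
# function below is a stub with a hypothetical name; the event-range fields
# are a simplified version of the real format (GUID plus event positions).
import glob
import os


def get_event_ranges(n=5):
    """Stub for the event-range request to PanDA JEDI."""
    return [{"rangeID": "range-%d" % i, "GUID": "file-guid",
             "startEvent": 100 * i + 1, "lastEvent": 100 * (i + 1)}
            for i in range(n)]


def stream_to_payload(event_range):
    """Stub: hand one event range to the running AthenaMP workers."""
    print("dispatched", event_range["rangeID"])


def upload_to_object_store(path):
    """Stub: ship one per-range output file to the aggregation facility."""
    print("uploaded", path)


def collect_outputs(outdir, already_sent):
    """Pick up newly produced per-range output files and upload them."""
    for path in glob.glob(os.path.join(outdir, "*.pool.root")):
        if path not in already_sent:
            upload_to_object_store(path)
            already_sent.add(path)


if __name__ == "__main__":
    sent = set()
    for event_range in get_event_ranges():
        stream_to_payload(event_range)
        collect_outputs("es_output", sent)   # in reality a continuous watcher
```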
In summer 2014, we started to work on an HPC-specific implementation of the Event Service that would leverage MPI for running on multiple compute nodes simultaneously.

To speed up the development process and also to preserve all functionality already available in the conventional Event Service, we reused the existing code and implemented lightweight versions of the PanDA JEDI (Yoda, a diminutive JEDI) and the PanDA Pilot (Droid), which communicate with each other over MPI. Figure 4 shows a schematic of a Yoda application, which implements the master-slave architecture and runs one MPI rank per compute node. The responsibility of rank 0 (Yoda, the master) is to send event ranges to the other ranks (the Droids, the slaves) and to collect from them the information about the completed ranges and the produced outputs. Yoda also continuously updates event range statuses in a special table within an SQLite database file on the HPC shared file system. The responsibility of a Droid is to start an AthenaMP payload application on the compute node, receive event ranges from Yoda, deliver the ranges to the running payload, collect information about the completed ranges (e.g., status, output file name, and location), and pass this information back to Yoda.

Yoda distributes event ranges between Droids on a first-come, first-served basis. When a Droid reports completion of an event range, Yoda immediately responds with a new range for this Droid. In this way, Droids are kept busy until all ranges assigned to the given job have been processed or until the job exceeds its time allocation and is terminated by the batch scheduler. In the latter case, the data losses caused by such termination are minimal, because the output for each processed event range is saved immediately in a separate file on the shared file system.

Fig. 4. Yoda application, which implements the master-slave architecture and runs one MPI rank per compute node
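A compact sketch of this exchange, assuming mpi4py and an SQLite file for the bookkeeping table, is shown below. The message tags, the content of an event range, and the "processing" step are illustrative stand-ins, not the actual Yoda/Droid code.

```python
# Minimal sketch of the Yoda/Droid message pattern: rank 0 (Yoda) hands out
# event ranges on a first-come, first-served basis and records completions;
# the other ranks (Droids) process ranges and report back.
import sqlite3

from mpi4py import MPI

TAG_REQUEST, TAG_WORK, TAG_STOP = 1, 2, 3

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:                                   # Yoda, the master
    ranges = [{"rangeID": i} for i in range(100)]
    db = sqlite3.connect("ranges.db")           # bookkeeping on the shared FS
    db.execute("CREATE TABLE IF NOT EXISTS ranges (id INTEGER, status TEXT)")
    active = comm.Get_size() - 1
    while active > 0:
        status = MPI.Status()
        done = comm.recv(source=MPI.ANY_SOURCE, tag=TAG_REQUEST, status=status)
        if done is not None:                    # record the completed range
            db.execute("INSERT INTO ranges VALUES (?, ?)",
                       (done["rangeID"], "finished"))
            db.commit()
        if ranges:                              # immediately hand out a new range
            comm.send(ranges.pop(0), dest=status.Get_source(), tag=TAG_WORK)
        else:                                   # nothing left: release the Droid
            comm.send(None, dest=status.Get_source(), tag=TAG_STOP)
            active -= 1
    db.close()
else:                                           # Droid, a worker
    result = None
    while True:
        comm.send(result, dest=0, tag=TAG_REQUEST)
        status = MPI.Status()
        work = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == TAG_STOP:
            break
        # ... here the range would be passed to the local AthenaMP payload ...
        result = work                           # pretend the range finished
```

Launched with one rank per compute node (e.g., mpirun -n <N> python yoda_sketch.py), rank 0 plays the role of Yoda and all other ranks act as Droids.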

CONCLUSIONS

PanDA's capability for large-scale data-intensive distributed processing has been thoroughly demonstrated in one of the most demanding big data computing environments. The layered structure of PanDA, which enables it to support a variety of middleware, heterogeneous computing systems, and diverse applications, also makes PanDA ideally suited as a common big-data processing system for many data-intensive sciences. PanDA lowers the barrier for scientists to easily carry out their research using a variety of distributed computing systems.

The LHC Run 2 will pose massive computing challenges for ATLAS. With a doubling of the beam energy and luminosity, as well as an increased need for simulated data, the data volume is expected to increase by a factor of 5-6 or more. Storing and processing this amount of data is a challenge that cannot be resolved with the currently existing computing resources in ATLAS. To resolve this challenge, ATLAS is exploring the use of supercomputers and HPC clusters via the PanDA system.

In this paper, we described a project aimed at the integration of the PanDA WMS with different supercomputers. Detailed information is given for the Titan supercomputer at the Oak Ridge Leadership Computing Facility and the supercomputer HPC2 at NRC KI. The current approach utilizes modified PanDA pilot frameworks for job submission to the supercomputers' batch queues and for local data management. The work underway also enables the use of PanDA by new scientific collaborations and communities beyond the LHC and even beyond HEP.

Acknowledgements. This work was funded in part by the US Department of Energy, Office of Science, High Energy Physics and Advanced Scientific Computing Research under Contracts Nos. DE-SC, DE-AC02-98CH10886, and DE-AC02-06CH. The NRC KI team work was funded by the Russian Ministry of Science and Education under Contract No. 14.Z. Supercomputing resources at the NRC KI are supported as a part of the center for collective usage (project RFMEFI62114X0006, funded by the Russian Ministry of Science and Education). We would like to acknowledge that this research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the US Department of Energy under Contract No. DE-AC05-00OR.

REFERENCES

1. Aad G. et al. (ATLAS Collab.). The ATLAS Experiment at the CERN Large Hadron Collider // J. Instrum. 2008. V. 3. P. S08003.
2. The Worldwide LHC Computing Grid (WLCG).
3. Maeno T. Overview of ATLAS PanDA Workload Management // J. Phys.: Conf. Ser. 2011. V. 331. P. 072024.
4. Nilsson P. The ATLAS PanDA Pilot in Operation // Proc. of the 18th Intern. Conf. on Computing in High Energy and Nuclear Physics (CHEP2010).
5. Turilli M., Santcroos M., Jha S. A Comprehensive Perspective on the Pilot-Job Systems.
6. Titan at OLCF Web Page.
7. Top500 List.
8. The SAGA Framework Web Site.
9. Kurchatov Institute HPC Cluster.
10. Schubert M. et al. Characterization of Ancient and Modern Genomes by SNP Detection and Phylogenomic and Metagenomic Analysis Using PALEOMIX // Nat. Protoc. 2014. V. 9, No. 5. P. 1056-1082.
